ABSTRACT

It is widely appreciated that short tandem repeat (STR) variation underlies substantial phenotypic variation in organisms. Some propose that the high mutation rates of STRs in functional genomic regions facilitate evolutionary adaptation. Despite their high mutation rate, some STRs show little to no variation in populations. One such STR occurs in the Arabidopsis thaliana gene PFT1 (MED25), where it encodes an interrupted polyglutamine tract. Though the PFT1 STR is large (~270 bp), and thus expected to be extremely variable, it shows only minuscule variation across A. thaliana strains. We hypothesized that the PFT1 STR is under selective constraint, due to previously undescribed roles in PFT1 function. We investigated this hypothesis using plants expressing transgenic PFT1 constructs with either an endogenous STR or with synthetic STRs of varying length. Transgenic plants carrying the endogenous PFT1 STR generally performed best across adult PFT1-dependent traits, in terms of complementing a pft1 null mutant. In stark contrast, transgenic plants carrying a PFT1 transgene lacking the STR entirely phenocopied a pft1 loss-of-function mutant for flowering time phenotypes, and were generally hypomorphic for other traits, establishing the functional importance of this domain. Transgenic plants carrying various synthetic constructs occupied the phenotypic space between wild-type and pft1 loss-of-function mutants. By varying PFT1 STR length, we discovered that PFT1 can act as either an activator or repressor of flowering in a photoperiod-dependent manner. We conclude that the PFT1 STR is constrained to its approximate wild-type length by its various functional requirements. Our study implies that there is strong selection on STRs not only to generate allelic diversity, but also to maintain certain lengths pursuant to optimal molecular function.

INTRODUCTION

Short tandem repeats (STRs, microsatellites) are ubiquitous and unstable genomic elements that have extremely high mutation rates (Subramanian et al. 2003; Legendre et al. 2007; Eckert and Hile 2009), leading to STR copy number variation within populations. STR variation in coding and regulatory regions can have significant phenotypic consequences (Gemayel et al. 2010). For example, several devastating human diseases, including Huntington's disease and spinocerebellar ataxias, are caused by expanded STR alleles (Hannan 2010). However, STR variation can also confer beneficial phenotypic variation and may facilitate adaptation to new environments (Fondon et al. 2008; Gemayel et al. 2010). For example, in Saccharomyces cerevisiae, natural polyQ variation in the FLO1 protein underlies variation in flocculation, which is important for stress resistance and biofilm formation in yeasts (Verstrepen et al. 2005). Natural STR variants of the Arabidopsis thaliana gene ELF3, which encode variable polyQ tracts, can phenocopy elf3 loss-of-function phenotypes in a common reference background (Undurraga et al. 2012). Moreover, the phenotypic effects of ELF3 STR variants differed dramatically between the divergent backgrounds Col and Ws, consistent with the existence of background-specific modifiers. Genetic incompatibilities involving variation in several other STRs have been described in plants, flies, and fish (Peixoto et al. 1998; Scarpino et al. 2013; Rosas et al. 2014). Taken together, these observations argue that STR variation underlies substantial phenotypic variation, and may also underlie some genetic incompatibilities.

The A. thaliana gene PHYTOCHROME AND FLOWERING TIME 1 (PFT1, MEDIATOR 25, MED25) contains an STR of unknown function. In contrast to the comparatively short and pure ELF3 STR, the PFT1 STR encodes a long (~90 amino acids in PFT1, vs. 7-29 for ELF3), periodically interrupted polyQ tract. The far greater length of the PFT1 STR leads to the prediction that its allelic variation should be greater than that of the highly variable ELF3 STR (Legendre et al. 2007, http://www.igs.cnrs-mrs.fr/TandemRepeat/Plant/index.php). However, in a set of diverse A. thaliana strains, PFT1 STR variation was negligible compared to that of the ELF3 STR (Supp. Table 1). Also, unlike ELF3, the PFT1 polyQ is conserved in plants as distant as rice, though its purity decreases with increasing evolutionary distance from A. thaliana. A glutamine-rich C-terminus is conserved even in metazoan MED25 (File S1). Recent studies of coding STRs suggested that there may be different classes of STR. Specifically, tandem repeats that are conserved across large evolutionary distances appear in genes with substantially different functions than those coding tandem repeats that are not strongly conserved in species (Schaper et al. 2014). Consequently, PFT1/MED25 polyQ conservation may functionally differentiate the PFT1 STR from the ELF3 STR.

PFT1 encodes a subunit of Mediator, a conserved multi-subunit complex that acts as a molecular bridge between enhancer-bound transcriptional regulators and RNA polymerase II to initiate transcription (Backstrom et al. 2007; Conaway and Conaway 2011). PFT1/MED25 is shared across multicellular organisms but absent in yeast. In A. thaliana, the PFT1 protein binds to at least 19 different transcription factors (Elfving et al. 2011; Ou et al. 2011; Cevik et al. 2012; Chen et al. 2012) and has known roles in regulating a diverse set of processes such as organ size determination (Xu and Li 2011), ROS signaling in roots (Sundaravelpandian et al. 2013), biotic and abiotic stress (Elfving et al. 2011; Kidd et al. 2009; Chen et al. 2012), phyB-mediated light signaling, shade avoidance and flowering (Cerdan and Chory 2003; Wollenberg et al. 2008; Inigo, Alvarez, et al. 2012; Klose et al. 2012).

PFT1 was initially identified as a nuclear protein that negatively regulates the phyB pathway to promote flowering in response to specific light conditions (Cerdan and Chory 2003; Wollenberg et al. 2008). Recently, Inigo and colleagues (2012) showed that PFT1 activates CONSTANS (CO) transcription and FLOWERING LOCUS T (FT) transcription in a CO-independent manner. Specifically, proteasome-dependent degradation of PFT1 is required to activate FT transcription and to promote flowering (Inigo, Giraldez, et al. 2012). The wide range of PFT1-dependent phenotypes is unsurprising given its function in transcription initiation, yet it remains poorly understood how PFT1 integrates these many signaling pathways.

Given the conservation of the PFT1 polyQ tract and the known propensity of polyQ tracts for protein-protein and protein-DNA interactions (Escher et al. 2000; Schaefer et al. 2012), we hypothesized that this polyQ tract plays a role in the integration of multiple signaling pathways and is hence functionally constrained in length. We tested this hypothesis by generating transgenic lines expressing PFT1 with STRs of variable length and evaluating these lines for several PFT1-dependent developmental phenotypes. We show that the PFT1 STR is crucial for PFT1 function, and that PFT1-dependent phenotypes vary significantly with the length of the PFT1 STR. Specifically, the endogenous STR allele performed best for complementing the flowering and shade avoidance defects of the pft1-2 null mutant, though not for early seedling phenotypes. Our data indicate that most assayed PFT1-dependent phenotypes require a permissive PFT1 STR length. Taken together, our results suggest that the natural PFT1 STR length is constrained by the requirement of integrating multiple signaling pathways to determine diverse adult phenotypes.

RESULTS

We used amplification fragment length polymorphism analysis and Sanger sequencing to evaluate our expectation of high PFT1 STR variation across A. thaliana strains. However, we observed only three alleles of very similar size (encoding 88, 89 and 90 amino acids, Table S1), in contrast to six different alleles of the much shorter ELF3 STR among these strains, some of which are three times the length of the reference allele (Undurraga et al. 2012). These data implied that the PFT1 and ELF3 STRs respond to different selective pressures. In coding STRs, high variation has been associated with positive selection (Laidlaw et al. 2007), though some basal level of neutral variation is expected due to the high mutation rate of STRs. We hypothesized that the PFT1 STR was constrained to this particular length by PFT1's functional requirements.

To test this hypothesis, we generated transgenic A. thaliana carrying PFT1 transgenes with various STR lengths in an isogenic pft1-2 mutant background. These transgenics included an empty vector control (VC), 0R, 0.34R, 0.5R, 0.75R, 1R (endogenous PFT1 STR allele), 1.27R, and 1.5R constructs. All STRs are given as their approximate proportion of WT STR length; for instance, the 1R transgenic line contains the WT STR allele in the pft1-2 background (Table S2). We used expression analysis to select transgenic lines with similar PFT1 expression levels (Table S3).
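The naming scheme can be made concrete with a little arithmetic. The following is only a rough sketch: it assumes the ~270 bp WT STR length quoted in the Abstract and treats each label as an exact proportion, whereas the labels are approximate and the actual construct sequences are those in Table S2. The names `WT_STR_BP`, `ALLELE_PROPORTIONS` and `approximate_lengths` are illustrative and do not come from the study.

```python
# Approximate WT STR length in base pairs, as quoted in the Abstract (~270 bp).
WT_STR_BP = 270

# Construct labels from the Results; each number is the approximate
# proportion of the WT STR length (exact sequences are in Table S2).
ALLELE_PROPORTIONS = {
    "0R": 0.0, "0.34R": 0.34, "0.5R": 0.5, "0.75R": 0.75,
    "1R": 1.0, "1.27R": 1.27, "1.5R": 1.5,
}

def approximate_lengths(proportion, wt_bp=WT_STR_BP):
    """Return (base pairs, encoded amino acids), both rounded to integers."""
    bp = proportion * wt_bp
    return round(bp), round(bp / 3)  # three bases per codon

for label, proportion in ALLELE_PROPORTIONS.items():
    bp, aa = approximate_lengths(proportion)
    print(f"{label:>6}: ~{bp} bp, ~{aa} amino acids")
```

Under these assumptions the 1R allele corresponds to roughly 270 bp, or about 90 encoded residues, in line with the ~90 amino acid tract described in the Introduction.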

The PFT1 STR length is essential for wild-type flowering and shade avoidance: We first evaluated the functionality of the different transgenic lines in flowering phenotypes. Removing the STR entirely substantially delayed flowering under long days (LD; phenotypes: days to flowering and rosette leaf number at flowering; Figure 1A). In LD, any STR allele other than 0R was able to rescue the pft1-2 late-flowering phenotype. Indeed, one allele (1.5R) showed earlier flowering than WT (Figure 1B, 1C), whereas other alleles provided a complete or nearly complete rescue of the pft1-2 mutant (Figure 1D).
In short days (SD), we observed an unexpected reversal in rosette leaf phenotypes (compare SD and LD rosette leaves, Figures 1B, 1D). Rather than flowering late (adding more leaves) as in LD, the loss-of-function pft1-2 mutant appeared to flower early (fewer leaves at onset of flowering). Only the endogenous STR (1R) fully rescued this unexpected phenotype (Figure 1D). We observed the same mean trend for days to flowering in SD, although differences were not statistically significant, even for pft1-2 (Figure 1D). This discrepancy may be due to insufficient power, or to a physiological decoupling of number of rosette leaves at flowering and days to flowering phenotypes in pft1-2 under SD conditions. Regardless, our results indicate that the late-flowering phenotype of pft1-2 is specific to LD conditions. Our observation of this reversal in flowering time-related phenotypes appears to contradict previous data (Cerdan and Chory 2003). However, a closer examination of these data reveals that the previously reported rosette leaf numbers in SD for the pft1-2 mutant show a similar trend. PFT1 STR length shows an approximately linear positive relationship with the SD rosette leaf phenotype, forming an allelic series of phenotypic severity. This allelic series strongly supports our observation of either slower growth rate (i.e. delayed addition of leaves) or early flowering of pft1-2 as measured by SD rosette leaves at flowering.

PFT1 genetically interacts with the red/far-red light receptor phyB, which governs petiole length through the shade avoidance response (Cerdan and Chory 2003; Wollenberg et al. 2008). We measured petiole length at bolting for plants grown under LD to evaluate the strength of their shade avoidance response, and thus whether the genetic interaction is affected by repeat length. As with the flowering time phenotypes, we found that the 1R allele most effectively rescued the long-petiole phenotype of the pft1-2 null among all STR alleles (Figure 2), though some alleles (e.g. 1.5R) show a rescue that is nearly as good.

In summary, plants expressing the 1R transgene most closely resembled wild-type plants across a range of adult phenotypes. In contrast, the other STR alleles showed inconsistent performance across these phenotypes, rescuing only some phenotypes or at times out-performing wild-type.

Figure 1. PFT1 STR alleles differ in their ability to rescue a pft1 loss-of-function mutant for flowering phenotypes. A, C) Transgenic plants carrying different PFT1 STR alleles. Plants were grown under LD for 31 days and photographed. Background was removed in Adobe Photoshop CS 6.0. B, D) Strains sharing letters are not significantly different by Tukey's HSD test. Black lines represent WT means, red lines represent pft1-2 means for each phenotype. Each STR allele is represented by at least two independent transgenic lines (Table S3), with N>20 for SD phenotypes and N>35 for LD phenotypes, α = 0.05. LD=long days, SD=short days. In SD flowering time (days), no groups are significantly different.

Figure 2. STR alleles differ in their ability to rescue a pft1 loss-of-function mutant for petiole length in long days. Strains sharing letters are not significantly different by ANOVA with Tukey's HSD test. Black lines represent WT means, red lines represent pft1-2 means for each phenotype. Each allele is represented by at least two independent transgenic lines, N>35 for each allele, α = 0.05.

PFT1 STR alleles fail to rescue early seedling phenotypes: We next assessed quantitative phenotypes in early seedling development, some of which had been previously connected to PFT1 function. Specifically, we measured hypocotyl and root length of dark-grown seedlings and examined germination in the presence of salt (known to be defective in pft1 mutants) (Elfving et al. 2011). The pft1-2 mutant showed the previously reported effect on hypocotyl length as well as a novel defect in root length (Figure 3A). None of the transgenic lines, including the one containing the 1R allele, effectively rescued these pft1-2 phenotypes (Figure 3A). Similarly, 1R was not able to rescue the germination defect of pft1-2 on high-salt media. However, both the 1.5R and 0.5R alleles were able to rescue this phenotype (Figure 3B). In summary, no single STR allele, including the endogenous 1R, was consistently able to rescue the early seedling phenotypes of the pft1-2 mutant. One explanation for the failure of the endogenous STR (PFT1-1R) to rescue early seedling phenotypes is that the PFT1 transgene represents only the larger of two splice forms. The smaller PFT1 splice form, which we did not test, may play a more important role in early seedling development. To explore this hypothesis, we measured mRNA levels of the two splice forms in pooled 7-day seedlings grown under the tested conditions and various adult tissues at flowering in Col-0 plants. However, we found that both splice forms were expressed in all samples, and in all samples the larger splice form was the predominant form (data not shown). The possibility remains that downstream regulation or tissue-specific expression may lead to a requirement for the smaller splice form in early seedlings.

[Figure 3 image. Panel A axes: Hypocotyl (mm), Root (mm). Panel B x-axis: Line.]

Figure 3. PFT1 STR alleles differ in their ability to rescue a pft1 loss-of-function mutant for early seedling phenotypes. A) Strains sharing letters are not significantly different by ANOVA with Tukey's HSD test. Black lines represent WT means, red lines represent pft1-2 means for each phenotype. Each allele is represented by at least two independent transgenic lines, N>100 for all phenotypes for each allele, pooled across at least two experiments; α = 0.05. Hypocotyl length and root length were assayed in 7d seedlings grown in dark conditions. B) Dark and light bars represent mean germination across 3 biological replicates on 0 mM NaCl and 200 mM NaCl, respectively. N = 36 for each replicate experiment. Error bars represent standard error across these three replicates.

Downloaded from http://biorxiv.org/ on September 18, 2014

Summarizing PFT1 STR function across all tested phenotypes: Given the complex phenotypic responses to PFT1 STR substitutions, results were equivocal as to which STR allele demonstrated the most 'wild-type-like' phenotype across traits, as measured by its sufficiency in rescuing pft1-2 null phenotypes. To summarize the various phenotypes, we calculated the mean of each quantitative phenotype for each allele, and used principal component analysis (PCA) to visualize the joint distribution of phenotypes observed.

All STR alleles were distributed between the pft1-2 null and wild-type (WT) in PC1, which was strongly associated with adult traits and represented a majority of phenotypic variation among lines (Figure 4). PC1 showed that 1R was the most generally efficacious allele for adult phenotypes. However, 1R showed incomplete rescue in early seedling phenotypes such as hypocotyl length, which drove PC2. All STR alleles showed substantial rescue in adult phenotypes, and even the 0R allele without an STR showed partial rescue in some phenotypes; however, rescue of early seedling phenotypes was generally poor for all alleles. The first principal component also captured our observation that the pft1-2 flowering defect reversed sign in SD vs. LD: according to Figure 4, SD and LD quantitative phenotypes are both strongly represented on principal component 1, but they show opposite directionality. We take this observation as support of this hitherto-unknown complexity in PFT1 function.

Figure 4. Distribution of PFT1 STR allele performance across all phenotypes, relative to wild-type and pft1-2 mutants. Biplot representation of PCA on all phenotypes across all tested PFT1 STR alleles. Percentages on axes are the % variance in the overall data contributed by that principal component. Contributions of specific phenotypes to these axes are shown by size and direction of arrows. Red arrows represent adult phenotypes, blue arrows represent early seedling phenotypes. "RosetteSD": number of rosette leaves under SD, "RosetteLD": number of rosette leaves under LD, "LongestLeafPL": petiole length of the longest leaf of rosette, "GermNaCl": proportion of germinants on 200 mM NaCl, "Hypocotyl" and "Root" refer to lengths of the specified organs in dark-grown seedlings. Transgenic STR alleles are indicated by their proportion of the wild-type (WT) repeat, e.g. "1.5R". Top and right axes provide a relative scale for the magnitude of phenotype vectors (blue and red arrows).

DISCUSSION

STR-containing proteins pose an intriguing puzzle: they are prone to in-frame mutations, which in many instances lead to dramatic phenotypic changes (Gemayel et al. 2010). Although STR-dependent variation has been linked to adaptation in a few cases, the presence of mutationally labile STRs in functionally important core components of cell biology seems counterintuitive. PFT1, also known as MED25, is a core component of the transcriptional machinery across eukaryotes and contains an STR that is predicted to be highly variable in length. Contrary to this prediction, we found PFT1 STR variation to be minimal, consistent with substantial functional constraint. The existing residual variation (~2% of reference STR length, as opposed to >100% for the ELF3 STR in the same A. thaliana strains) suggests that the PFT1 STR is mutationally labile like other STRs. In fact, several of the synthetic PFT1 alleles examined in this study arose spontaneously during cloning. Strong functional constraint, however, may select against such deviations in STR length in planta.
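The ~2% figure can be reproduced from the three allele sizes reported in the Results (88, 89 and 90 encoded amino acids). The sketch below is one plausible reading of that number: it takes the range of allele lengths as a percentage of a reference length, and it assumes, for illustration only, that the reference allele is the 90-residue one. The helper name `length_range_percent` is hypothetical.

```python
# Allele sizes (encoded amino acids) reported in the Results.
pft1_alleles_aa = [88, 89, 90]

def length_range_percent(allele_lengths, reference_length):
    """Range of allele lengths as a percentage of the reference length."""
    spread = max(allele_lengths) - min(allele_lengths)
    return 100.0 * spread / reference_length

# Assumption: the reference allele is taken to be the 90-residue allele.
pft1_variation = length_range_percent(pft1_alleles_aa, reference_length=90)
print(f"PFT1 STR length range: {pft1_variation:.1f}% of reference")
```

Choosing 88 or 89 as the reference instead changes the result only in the first decimal place, so the approximate value of 2% does not depend on that assumption.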

Here, we establish the essentiality of the full-length PFT1 STR and its encoded polyQ tract for proper PFT1 function in A. thaliana. We found that diverse developmental phenotypes were altered by the substitution of alternative STR lengths for the endogenous length. Leveraging the support of the PFT1 STR allelic series, we report new aspects of PFT1 function in flowering time and root development.

The PFT1 STR is required for PFT1 function in adult traits: The PFT1-0R lines did not effectively complement pft1-2 for adult phenotypes, suggesting a crucial role of the PFT1 STR in regulating the onset of flowering and shade avoidance. Generally, PFT1-1R was most effective in producing wild-type-like adult phenotypes. The precise length of the STR, however, seemed less important for the onset of flowering in LD. With the exception of PFT1-0R, all other STR alleles were also able to rescue the loss-of-function mutant to some extent, suggesting that as long as some repeat sequence is present, the PFT1 gene product can fulfill this function. Under other conditions, and for other adult phenotypes, requirements for PFT1 STR length appeared more stringent. Specifically, under SD, the rosette leaf number phenotype of the pft1-2 mutant can only be rescued by PFT1-1R, while STR alleles perform worse with increasing distance from this length "optimum".

pft1-2 mutants are late-flowering in LD but not SD: pft1-2 plants had fewer rosette leaves at flowering in SD, but more rosette leaves in LD, consistent with previous, largely undiscussed observations (Cerdan and Chory 2003). Under LD conditions, pft1-2 null mutants flowered late, as described in several previous studies (Cerdan and Chory 2003; Wollenberg et al. 2008), but we observed no such phenotype under SD conditions, contradicting at least one prior study (Cerdan and Chory 2003). These data suggest that while PFT1 functions as a flowering activator under LD, its role is more complex under SD.

One recent study showed that PFT1 function in LD is dependent upon its ability to bind E3 ubiquitin ligases (Inigo, Giraldez, et al. 2012). Inhibition of proteasome activity also prevents PFT1 from promoting FT transcription and thus inducing flowering, suggesting that degradation of PFT1 or associated proteins is a critical feature of PFT1's transcriptional activation of flowering in LD. If this degradation is somehow down-regulated in SD, PFT1 could switch from a flowering activator to a repressor, through decreased Mediator complex turnover at promoters. Recent studies raised the possibility that different PFT1-dependent signaling cascades have different requirements for PFT1 turnover (Ou et al. 2011; Kidd et al. 2009), which may contribute to the condition-specific PFT1 flowering phenotype we observe. Conservatively, we conclude that the regulatory process that mediates the phenotypic reversal between LD and SD depends on the endogenous PFT1 STR allele, suggesting that the polyQ is crucial to PFT1's activity as both activator and potentially as a repressor of flowering.

Incomplete complementation of germination and hypocotyl length by the PFT1 constructs: Whereas pft1-2 adult phenotypes were rescued by the PFT1-1R allele, most of our transgenic lines could not fully rescue pft1-2 early seedling phenotypes of 1) germination under salt, 2) hypocotyl length, and 3) root length. The PFT1 gene is predicted to have two different splice forms, the larger of which was used to generate our constructs (both splice forms contain the STR). Several studies have shown that, under stress conditions, different splice forms of the same gene can play distinct roles (Yan et al. 2012; Leviatan et al. 2013; Staiger and Brown 2013). We note that the conditions under which PFT1-1R fails to complement are also potentially stressful conditions (artificial media, sucrose, high salt, dark). The shorter splice form of PFT1 may be required in signaling pathways triggered under stress conditions. We presume that the failure to complement results from a deficiency related to this missing splice form. However, hypocotyl length was the only trait in which all examined STR alleles resembled the pft1-2 mutant. The significant functional differentiation among the STR alleles for root length and germination suggests that the large splice form does retain at least some function in early seedling traits.

Implications for STR and PFT1 biology:

Coding and regulatory STRs have been previously studied and discussed as a means of facilitating evolutionary innovation (Verstrepen et al. 2005). However, this means of innovation is based upon the same sequence characteristics that promote protein-protein and protein-DNA binding (Escher et al. 2000; Schaefer et al. 2012), such that STR variability must be balanced against functional constraints. This balance has recently been described for a set of 18 coding dinucleotide STRs in humans, which are maintained by natural selection even though any change in their length is likely to cause a frameshift (Haasl and Payseur 2014). These results, coupled with our observations, lend credence to these authors' previous argument that not all STRs act as agents of adaptive change (Haasl and Payseur 2013). Considering again the possibility that more conserved coding tandem repeats have distinct functions from non-conserved tandem repeats (Schaper et al. 2014), we suggest that PFT1 and ELF3 can serve as models for these two selective regimes, and that the structural roles of their respective polyQs underlie the differences in natural variation between the two. In some cases, such as ELF3, high variability is not always inconsistent with function, even while holding genetic background constant (Undurraga et al. 2012). In PFT1, we have identified an STR whose low variability reflects strong functional constraints. We speculate that these constraints are associated with a structural role for the PFT1 polyQ in the Mediator complex, either in protein-protein interactions with other subunits or in protein-DNA interactions with target promoters. Given that a glutamine-rich C-terminus appears to be a conserved feature of MED25 even in metazoans (File S1), we expect that our results are generalizable to Mediator function wherever this protein is present. Future work will be necessary to understand possible mechanisms by which the MED25 polyQ might facilitate Mediator complex function and contribute to ontogeny throughout life. Moreover, attempts must be made to understand the biological and structural characteristics unique to polyQ-containing proteins that tolerate (or encourage) polyQ variation, as opposed to those polyQ-containing proteins (like PFT1) that are under strong functional constraints.

METHODS

Cloning: A 1000 bp region directly upstream of the PFT1 coding region was amplified and cloned into the pBGW gateway vector (Karimi et al. 2002) to create the entry vector pBGW-PFT1p. A full-length PFT1 cDNA clone, BX816858, was obtained from the French Plant Genomic Resources Center (INRA, CNRGV), and used as the starting material for all our constructs. The PFT1 gene was cloned into the pENTR4 gateway vector (Invitrogen) and the repeat region was modified by site-directed mutagenesis with QuikChange (Agilent Technologies), followed by restriction digestions and ligations. The modified PFT1 alleles were finally transferred to the pBGW-PFT1p vector via recombination using LR clonase (Invitrogen) to yield the final expression vectors. Seven constructs expressing various polyQ lengths (Table S2), plus an empty vector control, were used to transform homozygous pft1-2 mutants by the floral dip method (Clough and Bent 1998). Putative transgenics were selected for herbicide resistance with Basta (Liberty herbicide; Bayer Crop Science) and the presence of the transgene was confirmed by PCR analysis. Homozygous T3 and T4 plants with relative PFT1 expression levels between 0.5 and 4 times the expression of Col-0 were utilized for all experiments described. A minimum of two independent lines per construct was used for all experiments.

Expression Analysis: All protocols were performed according to the manufacturers' recommendations unless otherwise noted. Total RNA was extracted from 30 mg of 10-day-old seedlings with the Promega SV Total RNA Isolation System (Promega). 2 µg of total RNA were subjected to an exhaustive DNase I treatment using the Ambion Turbo DNA-free Kit (Life Technologies). cDNA was synthesized from 100-300 ng of DNase-treated RNA samples with the Roche Transcriptor First Strand cDNA Synthesis Kit (Roche). Quantitative real-time PCR was performed in a LightCycler® 480 system (Roche) using the 480 DNA SYBR Green I Master kit. Three technical replicates were done for each sample. RT-PCR was performed under the following conditions: 5 min at 95 °C, followed by 35 cycles of 15 s at 95 °C, 20 s at 55 °C, and 20 s at 72 °C. After amplification, a melting-curve analysis was performed. Expression of UBC21 (At5g25760) was measured as a reference in each sample, and used to calculate relative PFT1 expression. All expression values were normalized relative to WT expression, which was always set to 1.0. To measure splice forms, the protocol was the same but reactions were carried out in a standard thermal cycler and visualized on 2% agarose stained with ethidium bromide. For primers, see Table S4.
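The normalization described above (a UBC21 reference, WT set to 1.0, and the 0.5 to 4-fold selection window from the Cloning section) can be sketched as follows. The text does not state which quantification model was used, so this example assumes the common 2^-ΔΔCt form with an amplification efficiency of 2; that model is an assumption, not the authors' documented procedure. All Ct values and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def relative_expression(ct_target, ct_ref, ct_target_wt, ct_ref_wt, efficiency=2.0):
    """Target expression relative to WT, normalized to a reference gene.

    Assumes the 2^-ddCt model (equal, near-perfect amplification efficiency).
    """
    delta_ct_sample = ct_target - ct_ref      # target vs. reference, sample
    delta_ct_wt = ct_target_wt - ct_ref_wt    # target vs. reference, WT
    return efficiency ** -(delta_ct_sample - delta_ct_wt)

def line_relative_expression(line, wt):
    """Average technical replicates, then normalize the line to WT."""
    return relative_expression(
        mean(line["PFT1"]), mean(line["UBC21"]),
        mean(wt["PFT1"]), mean(wt["UBC21"]),
    )

def within_selection_window(rel_expr, low=0.5, high=4.0):
    """Selection criterion from the Cloning section: 0.5x to 4x Col-0."""
    return low <= rel_expr <= high

# Hypothetical Ct values, three technical replicates each (not from the paper).
wt = {"PFT1": [24.1, 24.0, 24.2], "UBC21": [20.0, 20.1, 19.9]}
line_a = {"PFT1": [23.0, 23.1, 23.2], "UBC21": [20.0, 20.0, 20.0]}

expr_wt = line_relative_expression(wt, wt)      # WT against itself
expr_a = line_relative_expression(line_a, wt)   # hypothetical transgenic line
print(expr_wt, expr_a, within_selection_window(expr_a))
```

By construction, WT evaluated against itself gives 1.0, matching the convention in the text. A line whose PFT1 signal appears one cycle earlier than WT after reference correction comes out at about twice the WT level and would fall inside the selection window.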

Plant Materials and Growth Conditions: Homozygous plants for the T-DNA insertional mutant SALK_129555, pft1-2, were isolated by PCR analysis from an F2 population obtained from the Arabidopsis Stock Center (ABRC) (Alonso et al. 2003). Plants were genotyped with the T-DNA specific primer LBb1 (http://signal.salk.edu/tdna_FAQs.html) and gene-specific primers (Table S4).

Seeds were stratified at 4°C for 3 days prior to shifting to the designated growth conditions, with the shift day considered day 0. For flowering time experiments, plants were seeded using a randomized design with 15-20 replicates per line in 4x9 pot trays. Trays were rotated 180° and one position clockwise every day in order to further reduce any possible position effect. Plants for LD were grown in 16 hours of light and 8 hours of darkness per 24 hour period. Bolting was called once the stem reached 1 cm in height.

Full strength MS media containing MES, vitamins, 1% sucrose, and 0.24% phytagar were used for hypocotyl experiments. For germination experiments, half-strength MS media were used, supplemented with 1% sucrose, 0.5 g/L MES, and 2.4 g/L phytagel containing 200 mM NaCl or H2O mock treatment with the pH adjusted to 5.7. All media were sterilized by autoclaving with 30 minutes of sterilization time. Seeds for tissue culture were surface sterilized with ethanol treatment prior to plating and left at 4°C for 3 days prior to shifting to the designated growth conditions. Plants for hypocotyl experiments were grown with 16 hours at 22°C and 8 hours at 20°C in continuous darkness following an initial 2 hour exposure to light in order to induce germination. Germination experiments were scored on day 4 under LD at 20-22°C. ImageJ software was utilized to make all hypocotyl and root length measurements. Raw phenotypic data are included as File S3.

Statistical Analysis: All statistical analyses and plots were performed in R version 2.15.1 with α = 0.05 (R Development Core Team 2012). Phenotypic data were analyzed using the analysis of variance (ANOVA), followed by Tukey's HSD tests for the differences of groups within the ANOVA. Tukey's HSD is a standard post-hoc test for multiple comparisons of the means of groups with homogeneous variance that corrects for the number of comparisons performed. Principal component analysis was performed using the prcomp() function after scaling each phenotypic variable to mean=0 and variance=1 across lines (phenotypes are not measured on the same quantitative scale; for example, SD flowering time ranges from 80 to 140 days, whereas LD rosette leaves ranges ~5-15 leaves).
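The scaling step matters because a variable measured in days would otherwise dominate one measured in leaf counts. The sketch below illustrates the idea in plain Python rather than R: each phenotype is standardized to mean 0 and sample variance 1, and the first principal component is then estimated by power iteration on the resulting correlation matrix. This is a stand-in for what prcomp() does on scaled data, not the authors' script, and the per-line means are invented values chosen only to span the ranges quoted in the text.

```python
import math
import statistics

def standardize(column):
    """Scale one phenotype to mean 0 and sample variance 1."""
    mu = statistics.mean(column)
    sd = statistics.stdev(column)  # n - 1 denominator
    return [(x - mu) / sd for x in column]

def first_principal_component(columns, iterations=500):
    """Leading eigenvector of the correlation matrix via power iteration.

    Returns (loadings, proportion of total variance explained by PC1).
    """
    z = [standardize(c) for c in columns]
    n, p = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    # Start from the first axis; this fails only if PC1 has zero loading there.
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue / p  # the trace of a correlation matrix equals p

# Hypothetical per-line means on very different scales (not from the paper).
sd_days = [80.0, 95.0, 110.0, 125.0, 140.0]   # SD flowering time, days
ld_leaves = [15.0, 12.0, 10.0, 8.0, 5.0]      # LD rosette leaves

scaled_days = standardize(sd_days)
loadings, explained = first_principal_component([sd_days, ld_leaves])
print(loadings, explained)
```

After standardization, both phenotypes contribute on an equal footing regardless of their original units. In this made-up example the two variables move in opposite directions, so their PC1 loadings carry opposite signs.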

Sequence Analysis: Lengths of the ELF3 and PFT1 STRs were determined by Sanger (dideoxy) sequencing. Raw sequencing data are included as File S2. PFT1 and MED25 reference amino acid sequences were obtained from KEGG (Ogata et al. 1999) and aligned with Clustal Omega v1.0.3 with default options (Sievers et al. 2011).